This document illustrates how we can improve the style and performance of a Shiny app. The assignment consists of developing a Shiny app that tracks encounters with animal and plant species around the world. The dataset comes from two large CSV files (4 GB and 2.4 GB) that cannot be opened on an ordinary computer.

To demonstrate how we can improve reactivity and practise Shiny skills, we will walk through the survey sections and cite concrete examples.

library(remotes)
library(tictoc)
library(RSQLite)
library(tidyverse)
library(geojson)
library(geojsonio)
library(data.table)
library(profvis)
library(DT)
library(promises)
library(future)

1 Shiny Framework

1.1 Shiny: Building complete shiny apps (end to end)

The Shiny app developed during the assignment, named biodiversity, is a full R package that meets CRAN/Bioconductor criteria. It uses a map with multiple layers. Species are marked with different colors on the map. The user can show or hide any kingdom and search for any species by keyword; the app returns the matching species and focuses on the selected one.

To install and run the biodiversity Shiny app, run this code:

#install_github("kmezhoud/biodiversity")
library(biodiversity)
biodiversity::biodiversity()

Screenshot of biodiversity : (demo)

1.2 Shiny Performance: Making Shiny apps fast(er) and scalable

1.2.1 Loading data

As noted above, the dataset consists of two CSV files totalling about 6.5 GB. The first option for dealing with this is to convert the CSV files to a database, for example SQLite.
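As a minimal sketch of that conversion, the snippet below streams a CSV file into SQLite chunk by chunk, so the whole file never has to fit in memory. The small temporary file, its columns, the table name occurrence, and the chunk size are assumptions made for illustration; the real files are far larger and their schema is not shown here.

```r
library(DBI)
library(RSQLite)
library(readr)

# Small stand-in for one of the large CSV files (hypothetical columns)
csv_path <- tempfile(fileext = ".csv")
write_csv(
  data.frame(id = 1:10, country = rep(c("Poland", "Switzerland"), 5)),
  csv_path
)

# In-memory database for the example; a real app would use a file path
con <- dbConnect(SQLite(), ":memory:")

# Read the file 4 rows at a time and append each chunk to the same table
read_csv_chunked(
  csv_path,
  callback = SideEffectChunkCallback$new(function(chunk, pos) {
    dbWriteTable(con, "occurrence", as.data.frame(chunk), append = TRUE)
  }),
  chunk_size = 4,
  col_types = cols(id = col_integer(), country = col_character())
)

# Check that every row arrived
n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM occurrence")$n
dbDisconnect(con)
```

Fixing col_types up front keeps every chunk on the same column types, which matters when a later chunk would otherwise be guessed differently from the first.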

1.2.1.1 Use a database instead of flat file formats: csv, rds, …

In reality, using a database instead of CSV files is not enough to scale the app to thousands of users. It is also important to use faster functions, as shown here:

1.2.1.2 Use faster functions: apply family versus looping

con <-  DBI::dbConnect(RSQLite::SQLite(), "../../../DATA/Concours/Appsilon/biodiversity/biodiversity/inst/biodiversity/extdata/biodiversity.db") 

countries_list <- NULL
tic(msg = "###  LOOPING PROCESS  ###")
for (i in DBI::dbListTables(con)){
    countries_list[[i]] <- tbl(con,i) %>% as_tibble() 
  }
toc()
## ###  LOOPING PROCESS  ###: 4.46 sec elapsed
countries_list <- NULL
tic(msg = "### LAPPLY PROCESS  ###")
countries_list <- lapply(DBI::dbListTables(con), function(x) tbl(con,x) %>% as_tibble)
toc()
## ### LAPPLY PROCESS  ###: 3.828 sec elapsed

In this run, lapply was slightly faster than the loop (3.828 sec versus 4.46 sec). In general, it is better to use the apply family of functions instead of looping.
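One difference is worth noting: the for loop names each list element after its table, while lapply over an unnamed character vector returns an unnamed list. The sketch below keeps the names by naming the input with setNames(). Here read_one_table is a hypothetical stand-in for tbl(con, x) %>% as_tibble(), so the example runs without a database.

```r
# Hypothetical stand-in for reading one table from the database
read_one_table <- function(name) {
  data.frame(country = name, n_char = nchar(name))
}

tables <- c("Poland", "Switzerland")

# Loop version: each element is named after its table
by_loop <- list()
for (name in tables) {
  by_loop[[name]] <- read_one_table(name)
}

# lapply version: naming the input vector carries the names over to the result
by_lapply <- lapply(setNames(tables, tables), read_one_table)

names(by_lapply)
```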

1.2.1.3 Relieve the package of extra data

In our case, we need to load a GeoJSON map file. Files of this kind often carry metadata that makes them heavier. Here we compare two files with different amounts of metadata. The first is heavy and located in the /extdata folder; the other is loaded directly from its source URL.

##  Load Map SLOW  
tic(msg = "###  From File  ###")
countries_map <- geojson_read("../../../DATA/Concours/Appsilon/countries.geojson", what = "sp")
toc()
## ###  From File  ###: 4.111 sec elapsed
##  Load map FASTER
tic(msg = "###  From link  ###")
countries_map <- geojson_read("https://raw.githubusercontent.com/johan/world.geo.json/master/countries.geo.json", what = "sp")
toc()
## ###  From link  ###: 0.663 sec elapsed

Nice! In the next steps we will improve map loading with caching.

1.2.2 Pre-processing

1.2.2.1 Filtering Manner at the top of processing

We can compare multiple ways to filter our data. Here we tried five methods (grepl, %in%, str_detect, data.table with exact matching, and data.table with grepl, labelled DT in the output), and we will select the fastest one.

 set.seed(34)
countries <-  c("Poland", "Switzerland")
tic("###  USING grepl  ###")
 biodiversity_data<- countries_list %>% 
                     rbindlist() %>%
                     filter(grepl(paste0(countries, collapse = "|"), country, ignore.case = TRUE))

toc()
## ###  USING grepl  ###: 1.987 sec elapsed
tic("###  USING %in%  ###")
biodiversity_data <- countries_list %>% 
                     rbindlist() %>%
                     filter( country %in% countries )

toc()
## ###  USING %in%  ###: 0.754 sec elapsed
tic("###  USING str_detect  ###")

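biodiversity_data <- countries_list %>%
                     rbindlist() %>%
                     # collapse the countries into one regex; a length-2 pattern would be recycled row by row
                     filter( str_detect(country, paste0(countries, collapse = "|")))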
     
toc()
## ###  USING str_detect  ###: 0.799 sec elapsed
tic("###  USING data.table  ###")

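biodiversity_data <- countries_list %>%
                     rbindlist()   # rbindlist() already returns a data.table

# exact match on the country names; == against the collapsed "Poland|Switzerland" string matches no row
biodiversity_data <- biodiversity_data[country %in% countries]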
toc()
## ###  USING data.table  ###: 1.246 sec elapsed
tic("###  USING DT  ###")
biodiversity_data <- countries_list %>%
                     rbindlist() %>% as.data.table()

 biodiversity_data <-  biodiversity_data[grepl(paste0(countries, collapse ="|"), country)]

 toc()
## ###  USING DT  ###: 1.567 sec elapsed

On this run, %in% was the fastest method (0.754 sec), so it is the one we select.
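
The difference between exact matching and pattern matching can be checked on a toy table. This is only a sketch with made-up rows, not the biodiversity data. It shows that %in% and grepl on a collapsed pattern select the same rows here, whereas comparing with == against the collapsed pattern selects none.

```r
library(data.table)

countries <- c("Poland", "Switzerland")

# Toy stand-in for the combined country tables
dt <- data.table(
  country = c("Poland", "Germany", "Switzerland", "Poland", "France"),
  id = 1:5
)

# One regular expression that matches either country
pattern <- paste0(countries, collapse = "|")

by_in    <- dt[country %in% countries]   # exact match on a set of values
by_grepl <- dt[grepl(pattern, country)]  # regular-expression match
by_equal <- dt[country == pattern]       # compares with the literal "Poland|Switzerland"
```

Exact matching with %in% also avoids accidental partial matches, since a pattern such as "Niger" would also match "Nigeria".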

1.2.3 Profiling

As countries are added to the database, the app becomes slow, mainly while plotting the map. The bottleneck grows as more and more circle markers (encounters) have to be plotted on the map.

Note

In the case of the biodiversity package, loading the inputs (map and tables), processing the data, and building the map with circles do not seem to be the bottleneck. It is more likely the rendering on screen. We will test this hypothesis with the profvis package, which shows where the code spends the most time and memory.

profvis::profvis({biodiversity::biodiversity()})
## Loading required package: shiny
## 
## Attaching package: 'shiny'
## The following objects are masked from 'package:DT':
## 
##     dataTableOutput, renderDataTable
## The following object is masked from 'package:geojsonio':
## 
##     validate
## sourcing frontPage: 0.732 sec elapsed
## sourcing frontPage_ui: 0.005 sec elapsed
## [1] "NEW QUERY OF: Poland"
## Loading Map for Poland: 0.276 sec elapsed
## Loading Table of Poland: 0.218 sec elapsed
## data Processing of Poland: 0.712 sec elapsed
## Building the Map of Poland: 0.891 sec elapsed
## [1] "The biodiversity App is closed."

The blue color in the profile indicates that output$worldMap is the heaviest computation.

We will analyze it more thoroughly using tictoc.
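
As a rough sketch of that kind of step-by-step timing, the snippet below times named steps with tic()/toc() and collects the results from the tictoc log. The two steps are placeholders (a sleep and a small computation), not the app's real loading and processing code.

```r
library(tictoc)

tic.clearlog()

# Placeholder steps standing in for loading and processing
steps <- list(
  "Loading"    = function() Sys.sleep(0.05),
  "Processing" = function() sum(sqrt(seq_len(1e5)))
)

for (step in names(steps)) {
  tic(step)                      # start a timer labelled with the step name
  steps[[step]]()
  toc(log = TRUE, quiet = TRUE)  # stop it and store the result in the log
}

# Raw log entries hold the start time, end time, and message of each timer
timings <- tic.log(format = FALSE)
elapsed <- vapply(timings, function(x) unname(x$toc - x$tic), numeric(1))
names(elapsed) <- vapply(timings, function(x) x$msg, character(1))
elapsed
```

Collecting the timings in a named vector makes it easy to compare steps or to log them per country.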

1.2.4 Caching repeated processes during plotting

The user can select a country to focus on and search for species. When the user wants to change the country, a popup appears and waits for the country input. Here we can cache the previous plot: if the user (session level) or any user (application level) reselects the same country, the app can return the cached plot without recomputing it.

1.2.4.1 bindCache() on renderLeaflet()

We just added %>% bindCache(vals$countries, cache = "session") to the end of renderLeaflet().

vals <- reactiveValues(countries = NULL) 

  ## Listening OK button
  observeEvent(input$ok, {
    if (!is.null(input$countries_id) && nzchar(input$countries_id)) {
      vals$countries <- input$countries_id
      removeModal()
    } else {
      showModal(popupModal(failed = TRUE))
    }
  })
  
  
output$worldMap <- renderLeaflet({
     ...
      for (i in input$countries_id){
      #Put each table in the list, one by one
      table_list[[i]] <- tbl(con,i) %>% as_tibble()
      
      ...
    }
  
}) %>% bindCache(vals$countries, cache = "session")

Note

By adding %>% bindCache(vals$countries, cache = "session") to renderLeaflet(), we cache all the processing that ran for a specific vals$countries. If the user selects a new vals$countries, the computation is done; if not, the app returns the cached object. Concretely, we no longer see the progress bar for loading data, preprocessing, and plotting the map.

We used caching at the session level. We can generalize the cache to the application level and keep the object for multiple users by setting %>% bindCache(vals$countries, cache = "app").
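
For readers who want to try the idea in isolation, here is a minimal sketch of an app-level cache on a render function. It is not the biodiversity app: the input id country, the output id summary, and the two-second sleep standing in for the heavy work are all assumptions made for illustration.

```r
library(shiny)

ui <- fluidPage(
  selectInput("country", "Country", choices = c("Poland", "Switzerland")),
  textOutput("summary")
)

server <- function(input, output, session) {
  # The cache key is input$country: the body runs once per distinct country,
  # and cache = "app" shares the cached result across all sessions
  output$summary <- bindCache(
    renderText({
      Sys.sleep(2)  # stand-in for loading, processing, and map building
      paste("Computed for", input$country)
    }),
    input$country,
    cache = "app"
  )
}

app <- shinyApp(ui, server)
# Launch interactively with: runApp(app)
```

Selecting the same country a second time should return immediately, because the rendered value is read from the cache instead of being recomputed.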

library(biodiversityBindCache)
biodiversityBindCache::biodiversityBindCache()
## sourcing frontPage: 0.235 sec elapsed
## sourcing frontPage_ui: 0.028 sec elapsed
## [1] "NEW QUERY OF: Poland"
## Loading Map for Poland: 0.257 sec elapsed
## Loading Table of Poland: 0.212 sec elapsed
## data Processing of Poland: 0.644 sec elapsed
## Building the Map of Poland: 0.857 sec elapsed
## [1] "NEW QUERY OF: Poland"
## [1] "NEW QUERY OF: Switzerland"
## Loading Map for Switzerland: 0.258 sec elapsed
## Loading Table of Switzerland: 0.469 sec elapsed
## data Processing of Switzerland: 1.536 sec elapsed
## Building the Map of Switzerland: 1.772 sec elapsed
## [1] "NEW QUERY OF: Switzerland"
## [1] "NEW QUERY OF: Poland"
## [1] "The biodiversity App is closed."

Steps of this demo: Query Poland, Poland, Switzerland, Switzerland, Poland.

Only the first query on each country has computing steps.

You can see that Loading Map for … is run for both Poland and Switzerland, which is not necessary. In the next step, we will cache it with memoise.

1.2.4.2 Memoise functions

The same result can be obtained with the memoise function. Here we deliberately define a BAD function inside renderLeaflet just to see how it works.

output$worldMap <- renderLeaflet({
  
  leaflet_fun <- function(countries_id){
     ...
      # loop over the function argument so the memoise key matches what is computed
      for (i in countries_id){
      #Put each table in the list, one by one
      table_list[[i]] <- tbl(con,i) %>% as_tibble()
      
      ...
      }
    
  }

  ## memoise at the session level
  m_leaflet_fun <- memoise::memoise(leaflet_fun, cache = session$cache)

  ## call the memoised function inside renderLeaflet so its value is what gets rendered
  m_leaflet_fun(input$countries_id)
})

Note

The argument passed to the memoised function is the input country (the selected country). In this case, all processing runs only when the user selects a new country. As in the previous bindCache() example, each country is processed only the first time.

library(biodiversityMemoise)
biodiversityMemoise::biodiversityMemoise()
## sourcing frontPage: 0.1 sec elapsed
## sourcing frontPage_ui: 0.016 sec elapsed
## [1] "NEW QUERY OF: Poland"
## Loading Map for Poland: 0.421 sec elapsed
## Loading Table of Poland: 0.246 sec elapsed
## data Processing of Poland: 0.866 sec elapsed
## Building the Map of Poland: 0.756 sec elapsed
## [1] "NEW QUERY OF: Poland"
## [1] "NEW QUERY OF: Switzerland"
## Loading Map for Switzerland: 0.001 sec elapsed
## Loading Table of Switzerland: 0.44 sec elapsed
## data Processing of Switzerland: 1.447 sec elapsed
## Building the Map of Switzerland: 2.239 sec elapsed
## [1] "NEW QUERY OF: Switzerland"
## [1] "NEW QUERY OF: Poland"
## [1] "The biodiversity App is closed."

Steps: Query Poland, Poland, Switzerland, Switzerland, Poland.

Only the first query on each country has computing steps.

To memoise the process at the application level, we need to change the cache argument: m_leaflet_fun <- memoise::memoise(leaflet_fun, cache = getShinyOption("cache"))

Reading the GeoJSON map takes a while. It is also reloaded for each country, which is not necessary. To memoise the map we can do this:

    geojson_read_fun <- function(url){
      
      #withProgress(message = 'Loading Map ...', value = 20, {
        
        ##  Load Map source : https://datahub.io/core/geo-countries#r
        #countries_map <- geojson_read("extdata/countries.geojson", what = "sp")
        ##  Load map Faster
        geojson_read(url, what = "sp")
  
      #})
    }
    
    m_geojson_read_fun <- memoise::memoise(geojson_read_fun)
  tic(paste0("Loading Map for ", "Poland"))
  countries_map <- m_geojson_read_fun("https://raw.githubusercontent.com/johan/world.geo.json/master/countries.geo.json")
  toc()
## Loading Map for Poland: 0.341 sec elapsed
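
The effect of memoising a reader can be checked without any network access. In the sketch below, slow_read is a hypothetical stand-in for geojson_read(): it sleeps briefly and counts its calls. The example.org URL is a placeholder that is never fetched.

```r
library(memoise)

n_calls <- 0

# Hypothetical slow reader: counts how often the real work is done
slow_read <- function(url) {
  n_calls <<- n_calls + 1
  Sys.sleep(0.2)
  paste("map from", url)
}

m_slow_read <- memoise(slow_read)

# Same argument twice: the second call is served from the cache
first  <- m_slow_read("https://example.org/countries.geo.json")
second <- m_slow_read("https://example.org/countries.geo.json")

n_calls
```

Because the cache key is the function's arguments, calling the memoised reader with a different URL would run the real function again.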

Note

Nice! As you can see in the last run of biodiversityMemoise::biodiversityMemoise(), the first Loading Map… step (Poland) took 0.421 sec. The following query (Switzerland) has an elapsed time of 0.001 sec.

1.2.5 Asynchronous programming

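I did not find a way to use promises and future to improve data loading or renderLeaflet, since promises in Shiny are generally used within outputs, reactive expressions, and observers.

In our case, renderLeaflet is an exception among Shiny outputs. It expects a value, like renderText() or renderPlot(), but needs two data types: one to plot the map (GeoJSON) and one to enrich it with biodiversity data.

We can try to improve data or map loading with future(), but we cannot use the results directly as promise objects; they have to be converted back with value(). Note that the code below does not call plan(), so future() uses the default sequential plan and evaluates the expression right away, which is why creating the future itself takes several seconds and value() then returns almost instantly: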

  tic("NORMAl Loding the Map " )
    #countries_map <- geojson_read("https://raw.githubusercontent.com/johan/world.geo.json/master/countries.geo.json",what = "sp")
    countries_map <- geojson_read("../../../DATA/Concours/Appsilon/countries.geojson", what = "sp")
  toc()
## NORMAl Loding the Map : 3.151 sec elapsed
  tic("FUTURE Loading the Map " )
  countries_map <- future({
    #countries_map <- geojson_read("https://raw.githubusercontent.com/johan/world.geo.json/master/countries.geo.json",what = "sp")
    geojson_read("../../../DATA/Concours/Appsilon/countries.geojson", what = "sp")
  })
  toc()
## FUTURE Loading the Map : 3.63 sec elapsed
  tic("convert Map FUTURE using value()")
  countries_map <- value(countries_map)
  toc()
## convert Map FUTURE using value(): 0.003 sec elapsed
  tic("NORMAL Load data")
  biodiversity_data <-   read_rds("../../../DATA/Concours/Appsilon/biodiversity-data/full_data_Poland_Switzerland_Germany_France_Spain_USA.rds")
  toc()
## NORMAL Load data: 11.128 sec elapsed
  tic("FUTURE Load data")
  biodiversity_data_fu <-future({
    read_rds("../../../DATA/Concours/Appsilon/biodiversity-data/full_data_Poland_Switzerland_Germany_France_Spain_USA.rds")
  })
  toc()
## FUTURE Load data: 7.562 sec elapsed
  tic("Converting data FUTURE using Value()")
  biodiversity_data <- value(biodiversity_data_fu)
  print(paste0("Table dimension: ", dim(biodiversity_data)))
## [1] "Table dimension: 404411" "Table dimension: 15"
  toc()
## Converting data FUTURE using Value(): 0.003 sec elapsed
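
For comparison, here is a minimal sketch of the same pattern with a background plan, so the future is evaluated in a separate R session. The slow_load function and its one-second sleep are placeholders for the real file reads, and the worker count of 2 is an arbitrary assumption.

```r
library(future)

# Evaluate futures in background R sessions; keep the previous plan to restore it
old_plan <- plan(multisession, workers = 2)

# Placeholder for a slow read such as read_rds() or geojson_read()
slow_load <- function() {
  Sys.sleep(1)
  data.frame(country = c("Poland", "Switzerland"))
}

f <- future(slow_load())  # starts the work in a background session
# ... other work could be done here while the data loads ...
result <- value(f)        # blocks only until the background work is finished

plan(old_plan)
```

In a Shiny app the future would typically be returned from a reactive or render function as a promise rather than resolved with value(), so that other sessions are not blocked while it runs.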

1.3 Shiny Beautiful UI: Making Shiny apps (more) beautiful

Here is an example of how we can use CSS and JS files to make a Shiny app more beautiful. We can also add nicely formatted documentation using Markdown or R Markdown.

shinyUI(fluidPage(theme = shinytheme("flatly"), title = "Biodiversity", #superhero, flatly
                    
                  # Add CSS files
                  includeCSS(path = "www/AdminLTE.css"),
                  includeCSS(path = "www/shinydashboard.css"),
                  tags$head(includeCSS("www/styles.css")),
                  ## Include Appsilon logo at the right of the navbarPage
                  tags$head(tags$script(type="text/javascript", src = "logo.js" )),
                  ## Include Biodiversity logo
                  navbarPage(title=div(img(src="biodiversity.png", height = "50px", width = "50px",
                                           style = "position: relative; top: -14px; right: 1px;"),
                                       "Biodiversity"),
                             
                             
                             tabPanel("Globe",icon = icon('globe'),
                                      div(class="outer",
                                          tags$head(includeCSS("www/styles.css")),
                                          
                                          uiOutput('ui_frontPage')
                                      )),
                             
                             navbarMenu("", icon = icon("question-circle"),
                                        tabPanel("About",icon = icon("info"),
                                                 withMathJax(includeMarkdown("extdata/help/about.md"))
                                        ),
                                        tabPanel("Performance",icon = icon("creative-commons-sampling"),
                                                 withMathJax(includeMarkdown("extdata/help/performance.md"))
                                        ),
                                        tabPanel("Help",  icon = icon("question"),
                                                 withMathJax(includeMarkdown("extdata/help/help.md"))), 
                                        tabPanel(tags$a(
                                          "", href = "https://github.com/kmezhoud/biodiversity/issues", target = "_blank",
                                          list(icon("github"), "Report issue")
                                        )),
                                        tabPanel(tags$a(
                                          "", href = "https://github.com/kmezhoud/biodiversity", target = "_blank",
                                          list(icon("globe"), "Resources")
                                        ))
                             )
                   
                                       
                  )
))

We can add a transparent layer with a button, a collapsible table, and plotly charts. For example, this piece of code generates a transparent panel with a button over the map:

    column(width = 12,#style='height:200px',
           div(class="outer",
               tags$head(includeCSS("www/styles.css")),
               leafletOutput("worldMap", height = "600px"),
               absolutePanel(id = "panel_id", class = "panel panel-default",
                             top = 300, left = 20, width = 45, fixed=FALSE,
                             draggable = TRUE, height =  45 ,
                             # Make the absolutePanel collapsible
                             #HTML('<button data-toggle="collapse" data-target="#popup_id">Country</button> '),
                             #tags$div(id = 'popup_id',  class="collapse",#style='background-color:transparent; border-color: transparent',
                             div(actionButton(inputId = "popup_id",label = "",
                                               icon = icon("globe"),
                                               style='background-color:transparent; border-color: transparent'),
                                 style = "font-size:100%") 
                             #)
               ),
               
               
           )
    )

1.4 Javascript for Shiny: Extending Shiny with advanced functionalities

I can cite a wrapper of the zxing JS library for Shiny apps. The goal is to read barcodes using a smartphone or laptop camera (link, demo).

The demo shows a complex DT table with multiple gadgets:

1- SelectInput in each row

2- Buttons in each row to run the barcode reader using the camera.

The user can scroll through the serial numbers or scan a barcode to match the serial number. Each button recognizes its row_id.

Scrolling Serial Number Scan barcode: The app returns the serial number of the barcode

Below the image, the gadget returns the serial number of the scanned barcode. I added a beep sound to tell the user that the scan succeeded. The returned serial number goes automatically to the specific cell in the DT table.

2 Generic key skills

2.1 Software Testing: Delivering high quality solutions thanks to testing

Most of the Shiny apps that I have developed are full R packages with:

1- Build, test, and check processes during development

2- Function documentation

3- Vignettes for users

If the task is part of a big project hosted on GitHub:

1- Clone the GitHub branch

2- Make changes and features for the solution

3- Build, Test, Check

4- Pull request
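
As an illustration of the test step, a unit test written with testthat might look like the sketch below. The helper filter_countries is hypothetical and defined here only so the test has something to check; it is not a function from the biodiversity package.

```r
library(testthat)

# Hypothetical helper: keep only the rows for the requested countries
filter_countries <- function(data, countries) {
  data[data$country %in% countries, , drop = FALSE]
}

test_that("filter_countries keeps only the requested countries", {
  data <- data.frame(country = c("Poland", "Germany", "Switzerland"), n = 1:3)

  out <- filter_countries(data, c("Poland", "Switzerland"))
  expect_equal(out$country, c("Poland", "Switzerland"))

  # A country that is absent from the data yields an empty result
  expect_equal(nrow(filter_countries(data, "France")), 0)
})
```

In a package, a test like this would live under tests/testthat/ and run as part of the build, test, and check cycle.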

2.2 Quick Prototyping: Quick prototyping in at least one technology

I draw on my background and on examples that I developed in previous years.

I can also use helpful packages such as golem, shinydashboard, shiny.semantic, and shinyMobile to accelerate prototyping.

3 Fullstack

3.2 Fullstack Javascript

I try to wrap JS libraries for Shiny for better performance.

Case study: A wrapper of zxing JS library to shinyApp to read barcode using camera. (github), (demo)

4 Deployment

4.1 Shiny Infrastructure: Knowledge of infrastructure solutions for Shiny

Local infrastructure

The infrastructure of a Shiny app is like that of an R package. As an illustration, please see the infrastructure of the Shiny app package developed during the assignment step (private, but appsilon-hiring is a collaborator).

Deployment options:

1- When using shiny server (using local or cloud ubuntu server), the app can be installed as a package or deployed at /srv/shiny-server/app_name.

2- As a docker image: docker run -d -p 3838:3838 kmezhoud/biodiversity:0.1

3- As a Kubernetes image

Deployment setting:

1- nginx or traefik

2- Port and redirection

3- Firewall, iptables, and email alerts

4- Reverse proxy and SSL domain certificate

5- Use keyring for sensitive values such as login details, IP addresses, and the table and column names of the database.

I can cite an example of a Shiny mobile app named mobi100c. It was developed to simplify the ordering process between customers, retailers, and stores, with a central Azure database and different login profiles. (login:WASSIM, pwd:1)

4.2 RStudio Products

I am familiar with RStudio and RStudio Server. I can install and use RStudio Server on DigitalOcean.

4.3 AWS infrastructure

I suppose it is similar to DigitalOcean.

4.4 Google Cloud infrastructure

I suppose it is similar to DigitalOcean.

4.5 Azure infrastructure

I suppose it is similar to DigitalOcean.

5 Machine Learning

1- Flood Prediction In Malawi

2- Biological Activities prediction of new chemical compound

3- Dealing with unbalanced data in machine learning

4- Survival Lung Cancer Event

5- Instacart Market Basket Analysis - Apriori algorithm

6- Fraud transaction prediction